COMPACTLY GENERATING ALL SATISFYING TRUTH 
ASSIGNMENTS OF A HORN FORMULA 

Marcel Wild 



Abstract: As an instance of an overarching principle of exclusion, an algorithm is presented 
that compactly (thus not one by one) generates all models of a Horn formula. The principle 
of exclusion can be adapted to generate only the models of weight $k$. We compare and 
contrast it with constraint programming, 0,1 integer programming, and binary decision 
diagrams. 

1. Introduction 

This introduction soon jumps in medias res with a concrete example of a Horn formula 
on six variables, and a (4 × 6)-table that compactly represents all its models (= satisfying 
truth value assignments). We then indicate applications that benefit from the possibility 
to efficiently produce or count all models $X$ of a Horn formula; or alternatively only the 
$X$'s with $|X| = k$ for some prescribed integer $k$. Afterwards the section break-up displays 
the article's fine structure. Concepts only sloppily defined in the introduction will be 
formally defined in Section 2. 

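So consider this Horn formula $\varphi = \varphi(a_1, \dots, a_6)$: 

$\varphi := (\neg a_1 \vee \neg a_2 \vee \neg a_3 \vee a_5) \wedge (\neg a_1 \vee \neg a_2 \vee \neg a_3 \vee a_6) \wedge (\neg a_3 \vee \neg a_4 \vee \neg a_5 \vee a_6) \wedge (\neg a_1 \vee \neg a_3 \vee \neg a_6)$ 

Setting 0 := False and 1 := True one verifies that say $t := (a_1, a_2, a_3, a_4, a_5, a_6) = 
(1, 0, 1, 1, 0, 0)$ is a model of $\varphi$, that is, $\varphi(t) = 1$. The following table compactly represents 
all models of $\varphi$: 

              1  2  3  4  5  6
  $\rho_1$ =  2  2  0  2  2  2
  $\rho_2$ =  0  2  1  n  n  2
  $\rho_3$ =  1  0  1  n  n  0
  $\rho_4$ =  0  2  1  1  1  1

Table 1 
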
Each of the rows $\rho_i$ represents several models of $\varphi$. Namely, a label 2 indicates that the 
corresponding entry is free to be 0 or 1, and the wildcard $nn$ means that at least one 0 
must be present there (thus $nn = 11$ is forbidden). For instance, the model (1, 0, 1, 1, 0, 0) 
from before belongs to $\rho_3$ (here $nn = 10$). Viewed as set systems the rows $\rho_i$ happen to 
be mutually disjoint, and so the number $N$ of models evaluates to 

$N = |\rho_1| + |\rho_2| + |\rho_3| + |\rho_4| = 32 + 4 \cdot 3 + 3 + 2 = 49$ 

Also the number $N'$ of 3-element models (say) is conveniently calculated as 

$N' = 10 + 5 + 2 + 0 = 17$ 

Generating all models from Table 1 is just as easy but necessarily more time and space 
consuming. 
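
The model counts of this example can be cross-checked by brute force. The following is only a minimal sketch: it uses the set notation for the example formula that Section 2 introduces (two implications and one negative clause), and the names `IMPLICATIONS`, `NEGATIVE_CLAUSES`, `is_model` and `all_models` are illustrative assumptions, not part of the algorithm presented in this article.

```python
from itertools import product

# The example formula in set notation over the variables 1..6:
# each implication is a (premise, conclusion) pair, each negative
# clause is the set of variables that must not all be True.
IMPLICATIONS = [({1, 2, 3}, {5, 6}), ({3, 4, 5}, {6})]
NEGATIVE_CLAUSES = [{1, 3, 6}]

def is_model(x):
    """x is the set of variables assigned True."""
    closed = all(not premise <= x or conclusion <= x
                 for premise, conclusion in IMPLICATIONS)
    noncover = all(not clause <= x for clause in NEGATIVE_CLAUSES)
    return closed and noncover

def all_models(w=6):
    """Enumerate all 2^w assignments one by one and keep the models."""
    models = []
    for bits in product((0, 1), repeat=w):
        x = {i + 1 for i, bit in enumerate(bits) if bit}
        if is_model(x):
            models.append(x)
    return models

models = all_models()
n_total = len(models)                              # number of all models
n_three = sum(1 for x in models if len(x) == 3)    # models of cardinality 3
```

This one-by-one enumeration is exactly what the compact representation is meant to avoid; it serves only as an independent check on small instances.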

As to applications, getting all models of a Horn formula comprises the following special 
cases: 

(i) Get all sets of a closure system $\mathcal{C}$ from an implicational base $\Sigma$ of $\mathcal{C}$. 

(ii) Get all sets of a simplicial complex $\mathcal{F}$ from a negative-clause base of $\mathcal{F}$. 

Concerning (i), consider the closure system $\mathcal{C}$ of all subsemigroups of a semigroup $(W, \circ)$. 
Here an implicational base of $\mathcal{C}$ is obtained as the family $\Sigma$ of all "implications" $\{a, b\} \to 
\{a \circ b, b \circ a\}$ where $a, b$ range over $W$. Akin to general implicational bases that means that 
a subset $X \subseteq W$ is closed (i.e. a member of $\mathcal{C}$) if and only if it is $\Sigma$-closed in the sense that 
whenever a "premise" $\{a, b\}$ happens to be contained in $X$, then also the "conclusion" 
$\{a \circ b, b \circ a\}$ must lie in $X$. Using the algorithm of Section 5, our closure system $\mathcal{C}$ can 
be compactly generated "chunk-wise" as in Table 1, rather than one-by-one. Not just 
semigroups, all finite universal algebras can be dealt with this way. As another example, 
Formal Concept Analysis [GW] is a data mining method that revolves around a closure 
system $\mathcal{C}$, called formal concept lattice, for which it is harder to find an implicational base 
$\Sigma$. The original 1984 algorithm of Ganter produces $\mathcal{C}$ and $\Sigma$ simultaneously. It has been 
improved a number of times [KOV] but still is bound to the one-by-one generation of $\mathcal{C}$. 

Concerning (ii), a set $\Theta$ of sets $A^* \subseteq W$ is a negative-clause base of a simplicial complex 
$\mathcal{F}$ on $W$, if for all $X \subseteq W$ it holds that: $X$ belongs to $\mathcal{F}$ if and only if $A^* \not\subseteq X$ for 
all $A^* \in \Theta$. An efficient way for calculating $|\mathcal{F}|$ from $\Theta$, or more specifically the face 
numbers $f_k := |\{X \in \mathcal{F} : |X| = k\}|$, is useful in many situations, e.g. for getting the 
rank selection probabilities of a stack filter [W3]. 

Here comes the section break-up. Section 2 reviews basic material about Horn formulae 
and introduces some convenient set theoretic notation. In particular the $\varphi$ from before 
will be written as $\Sigma \cup \Theta$, where $\Sigma$ consists of the implications $\{1,2,3\} \to \{5,6\}$ and 
$\{3,4,5\} \to \{6\}$, and $\Theta$ consists of the set $\{1,3,6\}^*$. Section 3 explains in an informal way 
how Table 1 was derived from $\Sigma \cup \Theta$. The remaining sections more carefully distinguish 
between generating respectively counting models. Section 4 introduces the principle of 
exclusion (POE) which is a novel method for generating all models of a Boolean formula $\varphi$ 
that comes as a conjunction of subformulas of suitable shape. It was already employed in 
[W1] but will be discussed in more coherent form here. In Section 5 the POE gets refined 
to the Horn n-algorithm which handles Horn formulae $\varphi$. Section 6 aims at generating 
merely the $k$-element models of a Horn formula. Section 7 resumes the discussion of the 
POE but now from the counting point of view. Akin to before this gets refined in Section 8 
to counting all $k$-element models of a Horn formula. The final Section 9 positions the POE 
among other frameworks such as binary decision diagrams, branch and bound, constraint 
programming. 

2. On Horn formulae and closure systems 

A clause in propositional logic is called Horn if it contains at most one positive literal. 
We shall carefully distinguish the two ensuing subcases. Thus, a formula of type 

(NC) $\neg a_1 \vee \neg a_2 \vee \cdots \vee \neg a_n$ $(n \geq 1)$ 

respectively 

(UI) $\neg a_1 \vee \neg a_2 \vee \cdots \vee \neg a_n \vee b$ (equivalently: $(a_1 \wedge a_2 \wedge \cdots \wedge a_n) \to b$) $(n \geq 0)$ 

will be called a negative clause, respectively unit implication. A Horn formula is any 
propositional formula that is equivalent to a conjunction of negative clauses and unit 
implications. For the time being we concentrate on (UI) and return to (NC) in a moment. 

It is often convenient to combine unit implications with the same left hand side, for 
instance 

$((a \wedge c) \to b) \wedge ((a \wedge c) \to d) \wedge ((a \wedge c) \to e)$ 

is equivalent to $(a \wedge c) \to (b \wedge d \wedge e)$. If $t : \{a, b, c, d, e\} \to \{0, 1\}$ is any function, viewed 
as truth value assignment with 0 = False and 1 = True, then $t$ satisfies (or is a model 
of) $(a \wedge c) \to (b \wedge d \wedge e)$ if and only if 

$t(a) = 0$ or $t(c) = 0$ or $t(a) = t(b) = t(c) = t(d) = t(e) = 1$ 

We shall mostly identify a truth value assignment $t$ with the subset $X$ of variables that 
have $t$-value 1. For instance, if $W$ is our universe of variables, and $\{a, b, c, d, e\} \subseteq W$, 
then $X \subseteq W$ satisfies $\{a, c\} \to \{b, d, e\}$ if and only if $\{a, c\} \not\subseteq X$ or $\{a, b, c, d, e\} \subseteq X$. 
Generally, let $A = \{a_1, \dots, a_n\}$ and $B = \{b_1, \dots, b_m\}$ be subsets of $W$. We speak of 

(Im) $A \to B$ (as formula: $(a_1 \wedge \cdots \wedge a_n) \to (b_1 \wedge \cdots \wedge b_m)$) 

as an implication with premise $A$ and conclusion $B$. Combining $m$ unit implications of 
type $n = 0$ in (UI) yields $b_1 \wedge b_2 \wedge \cdots \wedge b_m$. This amounts to an implication with empty 
premise: 

(Im$_0$) $\emptyset \to \{b_1, b_2, \dots, b_m\}$ 

Be aware that the negative clauses $\neg a_1 \vee \cdots \vee \neg a_n$ in (NC) are not dual in any sense to the 
(unique if occurring) positive conjunction $b_1 \wedge b_2 \wedge \cdots \wedge b_m$ in (Im$_0$). Let $\Sigma$ be any¹ family 
of implications $A_i \to B_i$. One verifies at once that 

$Mod(\Sigma) := \{X \subseteq W \mid X \text{ is a model of all members of } \Sigma\}$ 

is closed under intersections, i.e. 

(1) $X \cap Y \in Mod(\Sigma)$ for all $X, Y \in Mod(\Sigma)$. 

¹Whereas there is information in $\emptyset \to B_i$, there is none in $A_i \to \emptyset$, and so these can be dropped. 
Furthermore, we henceforth silently assume that premise and conclusion are disjoint since $A \to B$ is 
equivalent to $A \to (B \setminus A)$. 

Moreover, $W$ satisfies all $A_i \to B_i$ (since $B_i \subseteq W$), and so 

(2) $W \in Mod(\Sigma)$, in particular $\Sigma$ is always satisfiable. 

By (1) and (2), $Mod(\Sigma)$ is a closure system and 

$cl(U) := \bigcap \{X \in Mod(\Sigma) : U \subseteq X\}$ 

is a closure operator on $2^W$. The $\Sigma$-closure $cl(U)$ can be computed [MR, Thm.10.3] in 
time $O(\|\Sigma\| + w)$ where $w := |W|$ and $\|\Sigma\|$ is defined as the sum of the cardinalities of 
all premises and conclusions of implications in $\Sigma$. Let $\mathcal{C}$ be any closure system on a set 
$W$. Then a set $\Sigma$ of implications such that $\mathcal{C} = Mod(\Sigma)$ is called an implicational base of 
$\mathcal{C}$. We recommend [BM] as a survey which also offers a historical perspective including 
the frequent rediscovery of concepts. 
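
To make the closure operator concrete, here is a minimal sketch that computes the $\Sigma$-closure by repeatedly firing implications until nothing changes. This naive fixpoint iteration is an assumption chosen for brevity; it is not the linear-time procedure of [MR] and may take quadratic time. The function name `closure` and the (premise, conclusion) pair encoding are illustrative.

```python
def closure(u, implications):
    """Smallest superset of u that satisfies every (premise, conclusion) pair."""
    closed = set(u)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in implications:
            # An implication fires when its premise is present
            # but part of its conclusion is still missing.
            if premise <= closed and not conclusion <= closed:
                closed |= conclusion
                changed = True
    return closed

# The two implications of the introductory example.
SIGMA = [({1, 2, 3}, {5, 6}), ({3, 4, 5}, {6})]
example = closure({1, 2, 3, 4}, SIGMA)
```

A set $X$ is then a model of $\Sigma$ exactly when `closure(X, SIGMA) == X`.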

As to negative clauses $\neg a_1 \vee \neg a_2 \vee \cdots \vee \neg a_n$ in (NC) we choose the set notation $\{a_1, \dots, a_n\}^*$. 
Thus, a truth assignment $t$ satisfies $A^* = \{a_1, \dots, a_n\}^*$ if and only if $X := t^{-1}(1) \subseteq W$ 
is a noncover for $A^*$ in the sense that $A^* \not\subseteq X$. For any set $\Theta$ of negative clauses $A^*$ it is 
clear that 

$Mod(\Theta) := \{X \subseteq W \mid X \text{ is a noncover for all } A^* \in \Theta\}$ 

is a simplicial complex, i.e. closed under subsets: 

(3) $X \in Mod(\Theta)$ and $Y \subseteq X$ implies $Y \in Mod(\Theta)$. 

Note that 

(4) $\emptyset \in Mod(\Theta)$, in particular $\Theta$ is always satisfiable. 

Let $\mathcal{F}$ be any simplicial complex on $W$. Then a set $\Theta$ of negative clauses $A^*$ will be called 
a negative-clause base of $\mathcal{F}$ if $Mod(\Theta) = \mathcal{F}$. 

We define a Horn $h$-formula as an $h$-element set $\Sigma \cup \Theta$ where $\Sigma$ consists of implications and 
$\Theta$ consists of negative clauses. As opposed to (2) and (4), $\Sigma \cup \Theta$ need not be satisfiable. 
Whether or not it is, can be settled with unit resolution; see [SS] for a concise account. 
An alternative method using $cl$ from above is mentioned after the proof of Theorem 2. 

3. The Horn n-algorithm - first serving 

Here we get a first impression of the Horn n-algorithm by working through an ad hoc 
example. That lays the foundation for its detailed description and theoretic evaluation in 
Section 5. Consider e.g. this Horn 3-formula: 

$\Sigma = \{ \{1,2,3\} \to \{5,6\}, \ \{3,4,5\} \to \{6\} \}$ 

$\Theta = \{ \{1,3,6\}^* \}$ 

Generally let $W = \{1, 2, \dots, w\}$ (here $w = 6$) and $Mod_0 := 2^W$. For $1 \leq i \leq h$ let $Mod_i$ 
be the family of all subsets $X \subseteq W$ that satisfy the first $i$ members of $\Sigma \cup \Theta$. In particular 
$Mod(\Sigma \cup \Theta) = Mod_h$. The main idea (to be elaborated in Section 4) is to calculate the 
subcollection $Mod_{i+1}$ from $Mod_i$ by discarding all $X \in Mod_i$ that falsify the $(i+1)$-th 
component. Any set $X \in Mod_i$ will be represented by its characteristic 0,1-vector of 
length $w$, but whenever possible we use the label 2 to indicate that an entry is free to 
be 0 or 1. That is easy for $Mod_0$ which here is $r = (2, 2, 2, 2, 2, 2)$. In order to represent 
$Mod_1$, let us split $r$ into the disjoint union of 

$r[n] = \{X \in r : \{1, 2, 3\} \not\subseteq X\}$ and $r[1] = \{X \in r : \{1, 2, 3\} \subseteq X\}$. 

We can compactly write these as 

$r[n] = (n, n, n, 2, 2, 2)$ 
$r[1] = (1, 1, 1, 2, 2, 2)$ 
with the understanding that the wildcard $nnn$ means "at least one 0". Hence the letter 
$n$ which stands for nul. All $X \in r[n]$ trivially satisfy the implication $\{1,2,3\} \to \{5,6\}$, 
but not all $X \in r[1]$ do. However, the good $X \in r[1]$ are easily pinned down and we get 
$Mod_1$ as the disjoint union of these {0, 1, 2, n}-valued rows: 

  1  2  3  4  5  6
  n  n  n  2  2  2
  1  1  1  2  1  1

Notice that all $X \in (1,1,1,2,1,1)$ satisfy the second implication $\{3,4,5\} \to \{6\}$, but 
not all $X \in (n,n,n,2,2,2)$ do so. In order to pin down the good $X$'s we split the row 
according to the third entry: 

$(n, n, n, 2, 2, 2) = (2, 2, 0, 2, 2, 2) \cup (n, n, 1, 2, 2, 2)$. 

All $X \in (2, 2, 0, 2, 2, 2)$ satisfy $\{3, 4, 5\} \to \{6\}$, and those $X \in (n, n, 1, 2, 2, 2)$ that satisfy 
it are exactly the ones in $r_2 \cup r_3$: 

           1  2  3  4  5  6
  $r_1$ =  2  2  0  2  2  2
  $r_2$ =  n  n  1  n  n  2
  $r_3$ =  n  n  1  1  1  1
  $r_4$ =  1  1  1  2  1  1

From the above it is clear that $Mod_2 = r_1 \cup r_2 \cup r_3 \cup r_4$. All members of $r_1$ satisfy the 
negative clause $\{1, 3, 6\}^*$, but no members of $r_4$ satisfy it. Hence $r_4$ needs to be cancelled. 
It is immediate that $\rho_4$ below comprises exactly the good $X \in r_3$, and not much harder 
to see that $\rho_2 \cup \rho_3$ comprises the good $X \in r_2$. 

              1  2  3  4  5  6
  $\rho_1$ =  2  2  0  2  2  2
  $\rho_2$ =  0  2  1  n  n  2
  $\rho_3$ =  1  0  1  n  n  0
  $\rho_4$ =  0  2  1  1  1  1

Table 1 

In summary, 

$Mod(\Sigma \cup \Theta) = Mod_3 = \rho_1 \cup \rho_2 \cup \rho_3 \cup \rho_4$ 

As seen in Section 1, all $X \in Mod(\Sigma \cup \Theta)$ can be counted or generated from Table 1 in 
obvious ways. We close this section by giving the formal definition of a {0, 1, 2, n}-valued 
row on a finite set $W$. It is a quadruplet 

$r := (zeros(r), ones(r), twos(r), nbubbles(r))$ 

such that 

(5) $W$ is the disjoint union of the sets $zeros(r)$, $ones(r)$, $twos(r)$, $nbubbles(r)$, where 
any one of these may be empty. 

(6) If $nbubbles(r)$ is nonempty, then it is a disjoint union of $t \geq 1$ many sets $nb_1, \dots, nb_t$ 
(called n-bubbles) such that $\nu_i := |nb_i| \geq 2$ for all $1 \leq i \leq t$. 

Thus $r$ can be visualized (up to permutation of the entries) as 

(7) $r = (\underbrace{0, \dots, 0}_{\alpha}, \underbrace{1, \dots, 1}_{\beta}, \underbrace{2, \dots, 2}_{\gamma}, \underbrace{n_1, \dots, n_1}_{\nu_1}, \dots, \underbrace{n_t, \dots, n_t}_{\nu_t})$ 

By definition, $r$ represents the family of all sets $X \subseteq W$ satisfying 

(8) $X \cap zeros(r) = \emptyset$ and $ones(r) \subseteq X$ and $(\forall\, 1 \leq i \leq t)\ nb_i \not\subseteq X$. 

It will however be convenient to identify $r$ with the family of $X$'s satisfying (8). Then, 
clearly, 

(9) $|r| = 2^{\gamma} \cdot (2^{\nu_1} - 1) \cdots (2^{\nu_t} - 1)$. 
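
Conditions (8) and (9) translate directly into code. The sketch below assumes a row is stored as a tuple of symbols, with the integers 0, 1, 2 and a string label such as `'n1'` for each n-bubble; this encoding and the names `row_size`, `contains` and `TABLE_1` are illustrative assumptions. The four rows in `TABLE_1` are the final rows of the example as they follow from the derivation in this section.

```python
def row_size(row):
    """|r| according to (9): 2^gamma times the product of (2^nu_i - 1)."""
    gamma = 0
    bubble_sizes = {}
    for sym in row:
        if sym == 2:
            gamma += 1
        elif isinstance(sym, str):           # an n-bubble label such as 'n1'
            bubble_sizes[sym] = bubble_sizes.get(sym, 0) + 1
    size = 2 ** gamma
    for nu in bubble_sizes.values():
        size *= 2 ** nu - 1                  # all patterns except "all ones"
    return size

def contains(row, x):
    """Condition (8); x is a set of 1-based positions."""
    bubbles = {}
    for pos, sym in enumerate(row, start=1):
        if sym == 0 and pos in x:
            return False
        if sym == 1 and pos not in x:
            return False
        if isinstance(sym, str):
            bubbles.setdefault(sym, set()).add(pos)
    # every n-bubble must miss at least one of its positions
    return all(not nb <= x for nb in bubbles.values())

# Final rows of the running example (both wildcards occupy positions 4, 5).
TABLE_1 = [
    (2, 2, 0, 2, 2, 2),
    (0, 2, 1, 'n1', 'n1', 2),
    (1, 0, 1, 'n1', 'n1', 0),
    (0, 2, 1, 1, 1, 1),
]
total = sum(row_size(r) for r in TABLE_1)
```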

4. The principle of exclusion aimed at generating 

A set $W$ with $|W| = w$ will be called a $w$-set. Formally, a constraint on a fixed $w$-set $W$ is 
a family $\mathcal{P} \subseteq 2^W$. We say that $X \subseteq W$ satisfies the constraint $\mathcal{P}$ if $X \in \mathcal{P}$. Equivalently 
a constraint can be defined as a Boolean function $b : \{0, 1\}^w \to \{0, 1\}$ in that $X \subseteq W$ 
satisfies $\mathcal{P}$ if and only if its characteristic vector $x$ satisfies $b(x) = 1$. Proceeding with 
Boolean function terminology our task below could be defined as a specific constraint 
satisfaction problem (CSP). We touch upon the standard CSP line of attack in Section 9 
but here we try another approach for which the set theoretic framework is more convenient. 

The task to find all models $X$ satisfying $h$ given constraints $\mathcal{P}_i$ amounts to determining 
$\mathcal{P}_1 \cap \mathcal{P}_2 \cap \cdots \cap \mathcal{P}_h$. For instance, $\mathcal{P}_i$ may be the constraint of being closed with respect 
to some implication $A_i \to B_i$. Starting with the powerset $Mod_0 := 2^W$ the principle of 
exclusion² (POE) generates 

$Mod_{i+1} := \{X \in Mod_i \mid X \text{ satisfies } \mathcal{P}_{i+1}\}$ 

by excluding all duds $X$ (i.e. violating $\mathcal{P}_{i+1}$) from the family $Mod_i$ of "partial models" 
(i.e. satisfying the first $i$ constraints). At the end $Mod_h$ equals $\mathcal{P}_1 \cap \cdots \cap \mathcal{P}_h$. All of that is 
only efficient when $Mod_i$ can be compactly represented as union of disjoint (multivalued) 
rows $r$. In the worst case $r$ is just a 0,1-vector but usually $r$ comprises other symbols as 
well. For instance, in Section 3 multivalued meant {0, 1, 2, n}-valued. A row of $Mod_h$ is 
called a final row. 

The row collection $Mod_{i+1}$ arises from $Mod_i$ by imposing constraint $\mathcal{P}_{i+1}$ on each row 
$r \in Mod_i$. Imposing $\mathcal{P}_{i+1}$ on $r$ happens in exactly one of three ways: 

(10) If no $X \in r$ satisfies $\mathcal{P}_{i+1}$, then $r$ is deleted. 

(11) Otherwise $r$ can sometimes be promoted to another row $r'$ which comprises exactly 
those $X \in r$ that satisfy $\mathcal{P}_{i+1}$. We call $r'$ a trivial son of $r$. In particular, if all 
$X \in r$ happen to satisfy $\mathcal{P}_{i+1}$ already, then $r' = r$. 

(12) If $\{X \in r \mid X \text{ satisfies } \mathcal{P}_{i+1}\} \subseteq r$ cannot be modelled by a trivial son $r'$, one proceeds 
as follows: 

(12.1) Row $r$ is split into disjoint candidate sons $r_j$ $(1 \leq j \leq s)$, i.e. each $X \in r$ is 
contained in exactly one $r_j$. Here $2 \leq s \leq cw$. 

(12.2) If $r_j$ contains no member satisfying $\mathcal{P}_{i+1}$, then $r_j$ is deleted. Otherwise $r_j$ is 
altered (shrunk) and promoted to a proper son $r'_j$ that contains exactly those 
$X \in r_j$ that satisfy $\mathcal{P}_{i+1}$. 

The $c$ in (12.1) is a global constant, i.e. depends only on the type of POE-application. 
For later purpose we define $s_{max}$ as the maximum $s$ occurring in any fixed concrete run of 
POE. Often $cw$ can be substituted by $c$, for instance $c = 3$ in the semigroup application 
of Section 1. If in (12.2) all candidate sons get killed, that amounts to (10). In a good 
use of the principle of exclusion the deletable rows, if any, should be recognized as soon 
as possible in order not to waste time on doomed successor rows. To simplify later proofs 
it is convenient to postulate the following condition which so far always held anyway: 

(13) The imposition of any constraint $\mathcal{P}_i$ upon any multivalued row $r$ of length $w$ costs 
time $O(w^2)$. 

²Of course this has got nothing to do with Pauli's famous "principle of exclusion" known from physics. 
The name arose as a contrast to the well known principle of inclusion-exclusion. 

4.1 Time assessment 

A multivalued row $r$ is called feasible if it contains at least one model. In other words, 
$r \cap \mathcal{P}_1 \cap \cdots \cap \mathcal{P}_h \neq \emptyset$. The fact that a feasible row never gets killed will be referred to as 
the consistency of the principle of exclusion. We say that a particular installment of the 
principle of exclusion avoids the deletion of rows if (10) above never occurs. 

One of the benefits of the principle of exclusion is that for any integer $k \leq w$ and any row 
$r$ the number 

(14) $Card(r, k) := |\{X \in r : |X| = k\}|$ 

of $k$-element members of $r$ is often easier to calculate than in other computing frameworks 
(Section 9). By focusing on $|X| \leq k$ respectively $|X| \geq k$ we similarly define $Card(r, \leq k)$ 
and $Card(r, \geq k)$. Thus $Card(r, \leq w)$ is just $|r|$ which, as in (9), was particularly easy to 
compute in all instances of POE so far. 
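
For a {0, 1, 2, n}-valued row the numbers $Card(r, k)$ can be read off from a product of small polynomials: the ones contribute $x^{\beta}$, the twos contribute $(1 + x)^{\gamma}$, and an n-bubble of size $\nu$ contributes $(1 + x)^{\nu} - x^{\nu}$. The sketch below is one straightforward way to do this and is an assumption for illustration, not necessarily the method used later in the article; the tuple encoding of rows (integers 0, 1, 2 and string labels for n-bubbles) and the name `card_by_size` are likewise assumptions.

```python
from math import comb

def card_by_size(row):
    """Return a list c with c[k] = Card(r, k) for k = 0..w."""
    w = len(row)
    ones = sum(1 for s in row if s == 1)
    twos = sum(1 for s in row if s == 2)
    bubble_sizes = {}
    for s in row:
        if isinstance(s, str):
            bubble_sizes[s] = bubble_sizes.get(s, 0) + 1
    # Start with x^ones * (1 + x)^twos.
    poly = [0] * (w + 1)
    for j in range(twos + 1):
        poly[ones + j] = comb(twos, j)
    # Multiply by (1 + x)^nu - x^nu for each n-bubble.
    for nu in bubble_sizes.values():
        factor = [comb(nu, j) for j in range(nu)]   # j = nu is excluded
        new = [0] * (w + 1)
        for i, a in enumerate(poly):
            if a:
                for j, b in enumerate(factor):
                    new[i + j] += a * b
        poly = new
    return poly

# Final rows of the running example of Section 3.
ROWS = [
    (2, 2, 0, 2, 2, 2),
    (0, 2, 1, 'n1', 'n1', 2),
    (1, 0, 1, 'n1', 'n1', 0),
    (0, 2, 1, 1, 1, 1),
]
totals = [sum(card_by_size(r)[k] for r in ROWS) for k in range(7)]
```

Since the final rows are disjoint, `totals[k]` is the number of $k$-element models of the example formula.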

We say that a function $f(h, w)$ is at least linear in $w$ if for some constant $c > 0$ it holds 
that $f(h, w) \geq cw$ for all positive reals $h$ and $w$. 

Theorem 1: Let $W$ be a $w$-set and let $\mathcal{P}_i \subseteq 2^W$ be constraints $(1 \leq i \leq h)$. Fix 
$k \in \{1, 2, \dots, w\}$. Suppose some "old" version of the principle of exclusion can be 
employed to produce disjoint multivalued rows whose union is the set of all models. 
Further assume that for some function $f(h, w)$ which is at least linear in $w$, 
it holds that: 

(a) For each row $r$ it costs time $O(f(h, w))$ to decide whether there is a model 
$X \in r$ with $|X| \leq k$. 

(b) If $r$ is a final row, then it costs $O(Card(r, \leq k)\, w f(h, w))$ to write down 
(in ordinary set notation) the sets $X \in r$ with $|X| \leq k$. 

Then the old version can be adjusted to a new one that avoids deleting rows and 
generates the $N$ models $X \subseteq W$ with $|X| \leq k$ in time $O(f(h, w) + N h w f(h, w))$. 



The requirement that all sets must have cardinality $\leq k$ cannot be treated as some extra 
constraint $\mathcal{P}_{h+1}$ because it will not be "imposed" the same way as the others. However, 
it is convenient to call $r$ extra feasible if there is a model $X \in r$ with $|X| \leq k$. For the 
special case $k = w$ in Theorem 1 "extra feasible" amounts to "feasible". 

Proof of Theorem 1. The first row always is $(2, 2, \dots, 2)$, i.e. the powerset of $W$. If it is 
not extra feasible, this can by (a) be detected in time $O(f(h, w))$ and then there are $N = 0$ 
models. That's the only reason why $O(f(h, w) + N h w f(h, w))$ in Theorem 1 cannot be 
replaced by $O(N h w f(h, w))$. An analogous argument in forthcoming theorems will not 
be repeated. 

So assume that $(2, 2, \dots, 2)$ is extra feasible. We shall argue by induction that the old 
algorithm can be renewed to make all promoted rows extra feasible, and so by consistency 
no promoted row can ever be deleted (having caused, together with its forefathers, much 
useless work). Hence, if $R$ is the number of final rows produced by the new algorithm, 
then the number of occurred impositions is at most $Rh$ (distinct finalized rows possibly 
having some of their $h$ forefathers coinciding). Below we shall show the "core claim" 
namely that the imposition of a constraint $\mathcal{P}_i$ upon a row still costs $O(w f(h, w))$ with the 
new algorithm. 

The cost of all impositions is thus $O(R h w f(h, w))$. Because the sum of all $R$ numbers 
$Card(r, \leq k)$ when $r$ ranges over the (disjoint!) final rows is $N$, it follows from (b) that 
the cost of generating all $(\leq k)$-element models from the final rows is $O(N w f(h, w))$. 
In view of $R \leq N$ adding up the two costs yields $O(R h w f(h, w)) + O(N w f(h, w)) = 
O(N h w f(h, w))$ as claimed. 

As to the core claim, let $r$ be extra feasible with say $X_0 \in r$ satisfying all constraints 
$\mathcal{P}_i$ $(1 \leq i \leq h)$ and having $|X_0| \leq k$. By (13) it costs $O(w^2)$ to impose a constraint $\mathcal{P}_i$ 
upon $r$. If $r$ gives rise to a trivial son $r'$, then by consistency still $X_0 \in r'$, and so $r'$ 
remains extra feasible. Suppose $r$ gives rise to the candidate sons $r_1, \dots, r_s$ $(s \geq 2)$. One 
of them, say $r_1$, must contain $X_0$. Say $r_1, \dots, r_t$ are exactly the extra feasible candidate 
sons. Even the old version of the principle of exclusion by consistency would promote 
$r_1, \dots, r_t$ to proper sons $r'_1, \dots, r'_t$. The new version additionally ensures, by testing all 
$r_j$ $(1 \leq j \leq s)$ for extra feasibility, that none of $r_{t+1}, \dots, r_s$ gets promoted. By (a) and 
since $s \leq cw$ by (12.1), all of that costs $O(w^2) + s\,O(f(h, w)) = O(w^2) + O(w f(h, w))$. 
Because $f(h, w)$ is at least linear in $w$ that reduces to $O(w f(h, w))$. ■ 

As to how the intermediate or final rows can be stored economically, see Section 7.1. 

An analogous argument shows that Theorem 1 also holds when $\leq k$ is switched to $\geq k$, 
respectively $= k$, throughout. Of course "old = new" is possible in Theorem 1; then 
simply one algorithm that avoids deletion of rows is assessed. 

Call a multivalued row $r$ weakly feasible if for all $1 \leq i \leq h$ there is some $X_i \in r$ that 
satisfies $\mathcal{P}_i$. Thus "feasible" amounts to saying that all $X_i$ $(1 \leq i \leq h)$ can be chosen identical. 
Because not all variants of the principle of exclusion allow a fast feasibility check, weak 
feasibility serves as a substitute: discarding rows which are not weakly feasible puts a lid 
on the number of deletions. All the theorems to come concern only the feasibility of rows, 
but weak feasibility will feature in our informal Section 9. 

5. The Horn n-algorithm - second serving 

Let us continue on a more technical level the discussion of the Horn n-algorithm begun 
in Section 3, making use of the POE framework displayed in Section 4. We first discuss 
the various cases that arise when an implication or negative clause is to be imposed on a 
row $r$. Afterwards Theorem 1 will be applied to the Horn n-algorithm. 

So let $A \to B$ be an implication, where $A$ (the premise) and $B$ (the conclusion) are w.l.o.g. 
nonvoid disjoint subsets of $W = \{1, 2, \dots, w\}$. It is to be imposed on a {0, 1, 2, n}-valued 
row $r$ indexed by $W$, as visualized in (7). 

Case 1: $A \cap zeros(r) \neq \emptyset$, or $A$ wholly contains an n-bubble, or $B \subseteq ones(r)$. In this case 
either all $X \in r$ have $A \not\subseteq X$, or all $X \in r$ have $B \subseteq X$. Whatever takes place, all $X \in r$ 
satisfy $A \to B$, and so row $r$ carries over unaltered. Here are three instances of rows $r$ 
that all satisfy $\{1,2,3\} \to \{5,7\}$. They correspond to the three mentioned subcases: 

Case 2: $A \subseteq ones(r)$ and ($B \cap zeros(r) \neq \emptyset$ or $B$ wholly contains an n-bubble). Then 
clearly $r$ needs to be cancelled. 

Case 3: $A \subseteq ones(r)$ and $B \cap zeros(r) = \emptyset$ and $B$ contains no n-bubble. Then we can 
switch all entries contained in $B$ to 1 (while adjusting some others). Using the terminology 
of Section 4 we thus obtain a trivial son $r' \subseteq r$ that satisfies $A \to B$. For instance, for 
$\{1,9\} \to \{3,4,5,6\}$ we get: 

Given that $A \subseteq ones(r)$, either Case 2 or Case 3 takes place. Hence in view of Case 1 
this is the only remaining possibility: 

Case 4: $A \not\subseteq ones(r)$, and $A \cap zeros(r) = \emptyset$, and $A$ does not wholly contain an n-bubble, 
and $B \not\subseteq ones(r)$. Therefore, putting 

$A_{ones} := A \cap ones(r)$, $A_{twos} := A \cap twos(r)$, $A_{nbubbles} := A \cap nbubbles(r)$, 

one has the disjoint union 

$A = A_{ones} \cup A_{twos} \cup A_{nbubbles}$ $(A_{ones} \neq A)$ 

In order to impose $A \to B$ we split $r(A \to B) := \{X \in r : X \text{ satisfies } A \to B\}$ as follows: 

$r(A \to B) = \underbrace{\{X \in r : A \not\subseteq X\}}_{r(\mathrm{diff})} \cup \underbrace{\{X \in r : A \cup B \subseteq X\}}_{r(\mathrm{easy})}$ 

From $A_{ones} \neq A$ follows $r(\mathrm{diff}) \neq \emptyset$, but $r(\mathrm{easy}) = \emptyset$ is possible. The "difficult" task 
will be to represent $r(\mathrm{diff})$ as a suitable disjoint union of {0, 1, 2, n}-valued rows. 

To fix ideas, take $W = \{1, 2, \dots, 14\}$ and let $A \to B$ be $\{1, 2, 3, 4, 5, 6, 7, 8\} \to \{12, 13\}$. 
If $r$ is the top row in Table 2 below, then the parameter $t$ in (6) is $t = 4$. Furthermore 

$A_{ones} = \emptyset$, $A_{twos} = \{1, 2\}$, $A_{nbubbles} = \{3, 4, 5, 6, 7, 8\}$. 

If we write $supp(n_1) = \{3, 4, 9\}$ for our first n-bubble, then $supp(n_1) \cap A_{nbubbles} = \{3, 4\}$. 
Splitting accordingly yields 

$r(\mathrm{diff}) = \underbrace{\{X \in r(\mathrm{diff}) : \{3,4\} \not\subseteq X\}}_{r[n]} \cup \underbrace{\{X \in r(\mathrm{diff}) : \{3,4\} \subseteq X\}}_{r[1]}$ 

In Table 2, notice that $n_1 n_1 n_1$ in $r$ becomes $nn2$ in $r[n]$, and 110 in $r[1]$ (not shown). 
Proceeding likewise with respect to $r[1]$ and $supp(n_2) = \{5, 10\}$ we get $r[1] = r[1,n] \cup 
r[1,1]$, and so on. Finally $r[1,1,1] = r[1,1,1,n] \cup r[1,1,1,1]$ where 

$r[1,1,1,n] := \{X \in r[1,1,1] : 8 \notin X\}$ 

$r[1,1,1,1] := \{X \in r[1,1,1] : 8 \in X\}$ 

It is clear that $r(\mathrm{diff})$ is the disjoint union 

$r(\mathrm{diff}) = r[n] \cup r[1,n] \cup r[1,1,n] \cup r[1,1,1,n] \cup r[1,1,1,1]$ 

As to $r(\mathrm{easy})$, if it is nonempty like here, it can be represented as a single {0, 1, 2, n}-
valued row. 

                 1   2   3   4   5   6   7   8   9   10  11  12  13  14
  r          =   2   2   n1  n1  n2  n3  n3  n4  n1  n2  n3  n3  n4  n4
  r[n]       =   2   2   n   n   n2  n3  n3  n4  2   n2  n3  n3  n4  n4
  r[1,n]     =   2   2   1   1   0   n3  n3  n4  0   2   n3  n3  n4  n4
  r[1,1,n]   =   2   2   1   1   1   n   n   n4  0   0   2   2   n4  n4
  r[1,1,1,n] =   2   2   1   1   1   1   1   0   0   0   n3  n3  2   2
  r[1,1,1,1] =   n   n   1   1   1   1   1   1   0   0   n3  n3  n4  n4
  r(easy)    =   1   1   1   1   1   1   1   1   0   0   0   1   1   0

Table 2 



Our chosen example for Case 4 was "almost typical." Let us indicate the possible slight 
deviations: 

(i) If the conclusion of $A \to B$ was $B = \{13, 14\}$, nothing would have changed in the 
decomposition of $r(\mathrm{diff})$, but then $r(\mathrm{easy}) = \emptyset$ because $n_4 n_4 n_4$ cannot be 111. 

(ii) We had $A_{ones} = \emptyset$, but $A_{ones} \neq \emptyset$ would merely have resulted in additionally copying 
a bunch of 1's in all rows $r[n]$, $r[1,n]$ up to $r(\mathrm{easy})$. 

(iii) Suppose $A \to B$ was $\{3, 4, 5, 6, 7, 8\} \to \{12, 13\}$ instead. Then we have $A = A_{nbubbles}$ 
which entails 

$r(\mathrm{diff}) = r[n] \cup r[1,n] \cup r[1,1,n] \cup r[1,1,1,n]$ (without $r[1,1,1,1]$!) 
$r(\mathrm{easy}) = (2, 2, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0)$ 

(iv) Suppose $A \to B$ was $\{1, 2\} \to \{12, 13\}$ instead. Then we have $A = A_{twos}$ which 
entails 

$r(\mathrm{diff}) = (n, n, n_1, n_1, n_2, n_3, n_3, n_4, n_1, n_2, n_3, n_3, n_4, n_4)$ 
$r(\mathrm{easy}) = (1, 1, n_1, n_1, n_2, n_3, n_3, n_4, n_1, n_2, n_3, 1, 1, n_4)$ 

It remains to show how negative clauses $A^*$ are imposed upon {0, 1, 2, n}-valued rows $r$. 
Matters being similar to the above we can be brief. 

Case 5: $A^* \cap zeros(r) \neq \emptyset$ or $A^*$ wholly contains an n-bubble. Then $r$ carries over 
unaltered. 

Case 6: $A^* \subseteq ones(r)$. Then $r$ needs to be cancelled. 

Case 7: $A^* \cap zeros(r) = \emptyset$ and $A^*$ does not wholly contain an n-bubble and $A^* \not\subseteq ones(r)$. 
Then with definitions analogous to Case 4 one has 

$A^* = A^*_{ones} \cup A^*_{twos} \cup A^*_{nbubbles}$ $(A^*_{ones} \neq A^*)$ 

and one treats $r(\mathrm{diff})$ exactly as in Case 4. Note that $r(\mathrm{easy})$ is absent in Case 7. ■ 
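
Cases 5 to 7 can be sketched in a few lines. The encoding below is an assumption made for illustration: a row is a tuple `(zeros, ones, twos, bubbles)` of frozensets, with `bubbles` a tuple of frozensets, and the helper names `forbid` and `force` are hypothetical. The pieces of $A^*$ lying in n-bubbles are processed first and the piece in $twos(r)$ last, mirroring the order used in Table 2.

```python
def forbid(row, piece, nb):
    """Sub-row in which `piece` is NOT contained in X.
    `piece` lies inside bubble nb, or inside twos if nb is None."""
    zeros, ones, twos, bubbles = row
    if nb is None:
        twos = twos - piece
    else:
        bubbles = tuple(b for b in bubbles if b != nb)
        twos = twos | (nb - piece)        # rest of the bubble becomes free
    if len(piece) == 1:
        zeros = zeros | piece             # a single entry is simply set to 0
    else:
        bubbles = bubbles + (piece,)      # otherwise a new, smaller n-bubble
    return (zeros, ones, twos, bubbles)

def force(row, piece, nb):
    """Sub-row in which `piece` (a proper subset of bubble nb) is contained in X."""
    zeros, ones, twos, bubbles = row
    rest = nb - piece
    bubbles = tuple(b for b in bubbles if b != nb)
    if len(rest) == 1:
        zeros = zeros | rest              # the last free entry must be 0
    else:
        bubbles = bubbles + (rest,)
    return (zeros, ones | piece, twos, bubbles)

def impose_negative_clause(row, a):
    """Disjoint rows whose union is {X in row : a is not a subset of X}."""
    zeros, ones, twos, bubbles = row
    if a & zeros or any(nb <= a for nb in bubbles):
        return [row]                      # Case 5: row carries over
    if a <= ones:
        return []                         # Case 6: row is cancelled
    pieces = [(a & nb, nb) for nb in bubbles if a & nb]
    if a & twos:
        pieces.append((a & twos, None))
    sons, current = [], row               # Case 7: successive splitting
    for piece, nb in pieces:
        sons.append(forbid(current, piece, nb))
        if nb is not None:
            current = force(current, piece, nb)
    return sons
```

Applied to the row $(n, n, 1, n, n, 2)$ of Section 3 and the clause $\{1, 3, 6\}^*$, this sketch returns two rows, corresponding to $(0, 2, 1, n, n, 2)$ and $(1, 0, 1, n, n, 0)$.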

The encountered row splitting process is quite visual and invites hand calculations for 
smaller size problems. From Case 4 it is clear that for the present application of the 
principle of exclusion the parameter $s_{max}$ from Section 4 is at most the smaller of $w/2$ (since 
$t \leq w/2$ in (7)) and $\max\{|A| : A \to B \text{ in } \Sigma\}$. Thus it costs $O(w^2)$ to impose an implication 
of $\Sigma$ on $r$. Ditto by Cases 5 to 7 it costs $O(w^2)$ to impose a negative clause of $\Theta$ upon 
$r$. Hence (13) is satisfied. Without further mention, it will be satisfied in all upcoming 
theorems as well. Recall the definition of a Horn $h$-formula from Section 2. 

Theorem 2: Given is a Horn $h$-formula $\Sigma \cup \Theta$ on $w$ variables. Then the presented Horn 
n-algorithm can be adapted to generate the $N$ models in time $O(hw + N h^2 w^2)$. 



Proof: The presented Horn n-algorithm needs to be upped from old to new according to 
Theorem 1 with $k = w$ (so extra feasible = feasible). If we manage to satisfy (a), (b) in 
Theorem 1 for $f(h, w) := hw$ (which is at least linear in $w$) then our $O(hw + N h^2 w^2) = 
O(f(h, w) + N h w f(h, w))$ claim will follow. 

Concerning (a), we need to show that the feasibility of a row $r$ can be tested in time 
$O(f(h, w))$. For each $Y \subseteq W$ let $\overline{Y}$ be the $\Sigma$-closure of $Y$, i.e. the smallest superset 
of $Y$ that satisfies all implications of $\Sigma$. As seen in Section 2, it costs $O(\|\Sigma\| + w) = 
O(hw + w) = O(f(h, w))$ to compute $\overline{Y}$. To check the feasibility of $r$, put $Y := ones(r)$. 
If $\overline{Y}$ is a noncover for all $A^*$ in $\Theta$ (testable in time $O(hw) = O(f(h, w))$), then $\overline{Y}$ is a 
$(\Sigma \cup \Theta)$-model, i.e. feasible. But is $\overline{Y} \in r$? This amounts to checking, again in time $O(hw)$, 
whether $\overline{Y} \cap zeros(r) = \emptyset$ and whether $\overline{Y}$ doesn't contain any n-bubble of $r$. It remains 
to show that when $X \in r$ is any $(\Sigma \cup \Theta)$-model, then so will be $\overline{Y}$. Indeed, since $X = \overline{X}$ 
(being a $\Sigma$-model), it follows from $Y = ones(r) \subseteq X$ that $\overline{Y} \subseteq X$. Therefore, with $X$ 
also $\overline{Y}$ is a noncover for all $A^*$ in $\Theta$. To summarize, we have shown: 

(14) $r$ is feasible if and only if first $\overline{Y}$ is a noncover for all $A^*$ in $\Theta$, second $\overline{Y} \cap zeros(r) = \emptyset$, 
third $\overline{Y}$ doesn't contain an n-bubble of $r$. This feasibility test costs $O(f(h, w))$. 
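
The three conditions of this feasibility test can be sketched as follows. The sketch assumes a row is given as a tuple `(zeros, ones, twos, bubbles)` of Python sets, with `bubbles` a list of sets, and it computes the closure by a naive fixpoint loop rather than by the linear-time method cited in Section 2; the names `closure` and `is_feasible` are illustrative.

```python
def closure(u, implications):
    """Naive Sigma-closure: fire implications until nothing changes."""
    closed = set(u)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in implications:
            if premise <= closed and not conclusion <= closed:
                closed |= conclusion
                changed = True
    return closed

def is_feasible(row, implications, negative_clauses):
    """True iff the row contains at least one model of the Horn formula."""
    zeros, ones, twos, bubbles = row
    y_bar = closure(ones, implications)
    if any(a <= y_bar for a in negative_clauses):
        return False                      # first condition fails
    if y_bar & zeros:
        return False                      # second condition fails
    return not any(nb <= y_bar for nb in bubbles)   # third condition

# The Horn 3-formula of Section 3.
SIGMA = [({1, 2, 3}, {5, 6}), ({3, 4, 5}, {6})]
THETA = [{1, 3, 6}]
```

When the test succeeds, `y_bar` itself is a model contained in the row, which is exactly the witness used in the proof.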

As to (b), it is straightforward to write down all $|r|$ sets of a {0, 1, 2, n}-valued row $r$ in 
some systematic way; for instance $r = (0, 1, 0, n_2, n_1, n_1, 1, n_2)$ can be handled "lexico-
graphically": 

{2,7} ∪ { } ∪ { },   {2,7} ∪ { } ∪ {4},   {2,7} ∪ { } ∪ {8} 
{2,7} ∪ {5} ∪ { },   {2,7} ∪ {5} ∪ {4},   {2,7} ∪ {5} ∪ {8} 
{2,7} ∪ {6} ∪ { },   {2,7} ∪ {6} ∪ {4},   {2,7} ∪ {6} ∪ {8} 

Hence generating all members of a row $r$ costs $O(|r| w) = O(Card(r, \leq w)\, w f(h, w))$ as 
required. ■ 
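
The systematic listing used for (b) can be sketched as a generator. The sketch assumes the tuple encoding of rows with integers 0, 1, 2 and string labels such as `'n1'` for n-bubbles; the function name `members` is illustrative. For each free entry it offers the choices "absent" or "present", and for each n-bubble every proper subset of its positions.

```python
from itertools import combinations, product

def members(row):
    """Yield every set X represented by the row, in a fixed systematic order."""
    ones = {p for p, s in enumerate(row, start=1) if s == 1}
    twos = [p for p, s in enumerate(row, start=1) if s == 2]
    bubbles = {}
    for p, s in enumerate(row, start=1):
        if isinstance(s, str):
            bubbles.setdefault(s, []).append(p)
    # One list of options per independent component of the row.
    choices = [[(), (p,)] for p in twos]
    for label in sorted(bubbles):
        nb = bubbles[label]
        # all subsets of the bubble except the bubble itself
        choices.append([c for k in range(len(nb)) for c in combinations(nb, k)])
    for pick in product(*choices):
        yield ones.union(*pick)

example_row = (0, 1, 0, 'n2', 'n1', 'n1', 1, 'n2')
listing = list(members(example_row))
```

For `example_row` the generator produces the nine sets in the same order as the listing above.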

As to the proof of (a), if $r = (2, 2, \dots, 2)$, then $\overline{Y}$ is the closure of the empty set, and the 
feasibility of $r$ amounts to the satisfiability of $\Sigma \cup \Theta$. 

In practice most problems are "homogeneous" in that either $\Sigma$ or $\Theta$ is empty. If $\Theta = \emptyset$, 
then only Cases 1 to 4 apply and we speak of the implication n-algorithm. If $\Sigma = \emptyset$ then 
only Cases 5 to 7 apply and we speak of the noncover n-algorithm. Its dual³ version is 
called transversal e-algorithm [W2]. Applications, refinements and numerical evaluations 
of the implication n-algorithm are work in progress. 

6. Generating all Horn-models of fixed cardinality 

The naive approach to $k$-element models is to retrieve (i.e. generate or count) from each 
final row $r$ all $X \in r$ with $|X| = k$. Trouble is $r$ may contain no such $X$ and should have 
been deleted long ago. Whether avoiding deletions in practice is worth the effort depends 
on the situation, but in order to get theoretic results the deletion of rows must be ruled 
out. As seen, this is the task of Theorem 1. We additionally need the following fact: 

(15) [W2, Thm.5] Let $r$ be a {0, 1, 2, n}-valued row. It costs time $O(w^2\, Card(r, k))$ to 
generate, i.e. write down in set notation, the sets $X \in r$ with $|X| = k$. 

³Notice that $X$ is a noncover of $A^*_1, \dots, A^*_h$ if and only if its complement $X^c = W \setminus X$ is a transversal 
in the sense that $X^c \cap A^*_i \neq \emptyset$ for all $1 \leq i \leq h$. Albeit the noncover n-algorithm can thus generate 
transversals, it pays to introduce the symbolism $ee \cdots e$ := "at least one 1" and a corresponding transversal 
e-algorithm which produces the transversals "directly", not as $X^c$. See also the last paragraph in Section 
9.4. 

Theorem 3: Let $\Sigma \cup \Theta$ be a Horn $h$-formula on $w$ variables, and $k \leq w$ a fixed 
integer. Then the $N$ many models with $|X| \leq k$ can be generated in time $O(hw + N h^2 w^2)$. 



Proof. Again we verify (a), (b) in Theorem 1 for f{h,w) :— hw. 

As to (a), one checks that r is extra feasible if and only ii Y = ones(r) satisfies the 
conditions stated in (14) and additionally \Y\ < k. (Notice that from |X| < k and Y C X 
follows |F| < k). The cost stays 0{hw). 

As to (b), by (15) it costs 

0(w2Card(r, 1)) + 0{w^CeiTd{r, 2)) + • • • + 0(w2Card(r, k)) 

= 0(^/;2Card(r, < k)) = 0(Card(r, < k)wf{h,w)) 
to generate all (< A;)-element members of r. ■ 

Recall that Theorem 1 still holds when $\le k$ is replaced by $= k$ throughout. Trouble is that 
in our "Horn situation" checking extra feasibility in condition (a) of Theorem 1 then gets 
more expensive. Putting it bluntly, as opposed to the proof of Theorem 3 the existence 
of a $(\Sigma \cup \Theta)$-model $X \in r$ with $|X| = k$ does not imply that $|Y| = k$. 

In the present Section 6 we only tackle a special case (Theorem 4) and postpone the naked 
$|X| = k$ to Section 8 which is dedicated to counting, not generating. The special case is 
such that $h < w$ and $k \le w - h$. Furthermore we focus on noncovers rather than arbitrary 
Horn formulae. 



Theorem 4: Given are $h$ subsets $A_i^*$ of a $w$-set $W$. Assume that $h < w$ and fix a non- 
negative integer $k \le w - h$. Then the $N$ many $k$-element noncovers $X$ of the set system 
$\{A_1^*, \ldots, A_h^*\}$ can be generated in time $O(hw + Nh^2w^2)$. 



Observe that naively testing all $k$-element subsets of $W$ costs $O(\binom{w}{k}hw) = O(w^{k+2})$ which, 
other than $O(Nh^2w^2)$, does not involve the possibly small number $N$. 

Proof of Theorem 4. We shall verify (a), (b) for $f(h,w) = hw$ in the $(= k)$-version of 
Theorem 1. Again (b) holds as in the proof of Theorem 3. 

It remains to show (a), i.e. that for any $\{0, 1, 2, n\}$-valued row $r$ its extra feasibility can 
be tested in time $O(f(h,w))$. Say the impositions of $A_{j+1}^*, \ldots, A_h^*$ upon $r$ are still 
pending. If one of these sets is contained in $\mathrm{ones}(r)$, or if $|\mathrm{ones}(r)| > k$, then $r$ is not extra 
feasible. Testing this costs $O(hw) = O(f(h,w))$. Conversely, suppose that $|\mathrm{ones}(r)| \le k$ 
and that $S_i := A_i^* \setminus \mathrm{ones}(r)$ is nonempty for all $j + 1 \le i \le h$. Looking at cases 1 to 7 
in Section 5 one sees that with each imposition of a constraint the number of n-bubbles 
in a son increases by at most one (in case 4 this number even decreases in many sons). 
It follows that our row $r$ contains at most $j$ n-bubbles. Say w.l.o.g. there are exactly $j$ 
of them. To match previous notation, call them $S_1, \ldots, S_j$. Take any transversal $T$ of 
$\{S_1, \ldots, S_h\}$ with $|T| \le h$. A minute's reflection shows that $X := W - T$ is a noncover 
of $\{A_1^*, \ldots, A_h^*\}$ and that $X \in r$. In view of $\mathrm{ones}(r) \subseteq X$ and 

$$|\mathrm{ones}(r)| \le k \le w - h \le |X|$$ 

we can extend $\mathrm{ones}(r)$ to a $k$-element subset $X_0$ of $X$. Then still $X_0 \in r$, and $X_0$ a 
fortiori is a noncover of $\{A_1^*, \ldots, A_h^*\}$. Hence $r$ is extra feasible. ■ 
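
The test used in this proof is short enough to sketch in Python. The function names and the small set system below are assumptions for illustration; the brute-force reference only covers rows whose entries are 1 (on `ones`) or 2 (elsewhere) with all sets still pending, and the agreement of the two functions relies on the hypotheses $h < w$ and $k \le w - h$ of Theorem 4.

```python
from itertools import combinations

def extra_feasible(ones, pending_sets, k):
    """Test from the proof: fail iff |ones(r)| > k or a pending set lies in ones(r)."""
    if len(ones) > k:
        return False
    return all(not A <= ones for A in pending_sets)

def brute_force(w, ones, pending_sets, k):
    """Reference check for a row with 1's on `ones` and 2's elsewhere."""
    if len(ones) > k:
        return False
    rest = [x for x in range(1, w + 1) if x not in ones]
    for extra in combinations(rest, k - len(ones)):
        X = ones | set(extra)
        if all(not A <= X for A in pending_sets):
            return True          # found a k-element noncover inside the row
    return False

# Hypothetical instance with w = 6, h = 3 and k = 3 = w - h.
sets = [{1, 2}, {2, 3}, {4}]
agree = all(
    extra_feasible(set(o), sets, 3) == brute_force(6, set(o), sets, 3)
    for m in range(7)
    for o in combinations(range(1, 7), m)
)
```

The cheap test inspects only $\mathrm{ones}(r)$ and the pending sets, which is what keeps its cost at $O(hw)$.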

7. The principle of exclusion when aimed at counting 

Often one only needs to count rather than generate all models. Below Theorem 1 is 
accordingly adapted. As in the proof of Theorem 1, we let $R$ be the number of final rows 
produced by the POE. Admittedly the only apparent theoretic upper bound of $R$ is $N$, 
but we stick to $R$ to emphasize that in practice $R$ often is much smaller than $N$ (see [W2] 
for experiments on random problems). Like Theorem 1, also Theorem 5 holds when $\le k$ 
is replaced by $\ge k$ or $= k$ throughout. 



Theorem 5: Let $W$ be a $w$-set and let $\mathcal{P}_i \subseteq 2^W$ be constraints. Fix $k \in \{1, 2, \ldots, w\}$. 
Suppose some "old" version of the principle of exclusion can be employed to produce disjoint 
multivalued rows whose union is the set of all models. Further assume that for some 
function f{h,w) which is at least linear in it holds that: 

(a) For each row $r$ it costs time $O(f(h,w))$ to decide whether $r$ is extra feasible 
in the sense of containing models $X$ with $|X| \le k$. 

(b) If $r$ is a finalized row, then it costs $O(wf(h,w))$ to calculate $\mathrm{Card}(r, \le k)$. 

Then the old version can be adjusted to a new one that avoids deleting rows and 
calculates the number of models $X \subseteq W$ with $|X| \le k$ in time $O(f(h,w) + Rhwf(h,w))$. 
Here $R \le N$ is the number of final rows produced by the new algorithm. 



Proof. The conditions in Theorem 5 are the same as in Theorem 1, except that in (b) we 
have $O(1 \cdot wf(h,w))$ instead of $O(\mathrm{Card}(r, \le k) \cdot wf(h,w))$. Since only (a) was used 
in Theorem 1 to establish the cost of all impositions as $O(Rhwf(h,w))$, that is also the 
corresponding cost in Theorem 5. As to the cost of counting all $(\le k)$-element 
models from the final rows, by (b) in Theorem 5 it costs $O(Rwf(h,w))$ to compute the 
$R$ numbers $\mathrm{Card}(r, \le k)$. Adding up these numbers (which in base 2 have length $\le w$) 
yields $N$ and costs $O(Rw)$. Hence the total cost is 

$$O(Rhwf(h,w)) + O(Rwf(h,w)) + O(Rw) = O(Rhwf(h,w)).$$ 
7.1 Space assessment 



Concerning time assessment, Theorem 5 is the twin of Theorem 1. Here we show that for 
the counting POE the required space can be reduced, in fact it doesn't even depend on 
the number N of models. 

Rather than calculating the often large row collections $\mathrm{Mod}_i$ stepwise for $i = 1$ up to $i = h$ 
(as we did in Section 3 for ease of visualization), it is better to employ a well known last 
in first out (LIFO) stack management. That is, each row $r_j$ carries a pointer $PC(r_j)$ (= 
pending constraint), and throughout only the top row $r_j$ of the working stack is updated. 
Specifically, if $PC(r_j) = k$ then constraint $\mathcal{P}_k$ is imposed upon $r_j$. This triggers the 
(trivial or proper) sons $r_{j+1}, r_{j+2}, \ldots$. They are put on the working stack in place of $r_j$, 
with corresponding pointers $PC$ set on $k + 1$. Whenever a row is finalized, i.e. $\mathcal{P}_h$ has 
been imposed on it, it is moved from the working stack to the final stack. For instance, 
using LIFO the imposition of $h = 4$ constraints $\mathcal{P}_i$ upon $r_1 = (2, 2, \ldots, 2)$ may begin as 
follows: 



[Fig. 1: successive snapshots of the working stack, top row first, each row annotated with 
its pointer PC. The last snapshot consists of four rows with pointers 4, 3, 3, 2.] 

If imposing $\mathcal{P}_4 = \mathcal{P}_h$ upon the top row results in (say) the proper sons $\rho_1, \rho_2, \rho_3$, the latter 
are the first members of the final stack, and one proceeds by imposing $\mathcal{P}_3$ upon the new top row. 




[Fig. 2: rooted tree of the rows produced so far; the surviving node labels are row 4, row 5, 
row 6, row 7.] 



The last working stack in Fig. 1 matches the rooted tree in Fig. 2 in that its four 
rows bijectively correspond to the leaves. It is clear that a rooted tree with maximum 
down-degree $s_{\max}$ and $h$ levels has at most $(h - 1)(s_{\max} - 1) + h < h s_{\max}$ leaves. 



Theorem 6: Suppose the principle of exclusion uses LIFO to enumerate 
(as opposed to generate) all subsets of a $w$-set that satisfy $h$ properties $\mathcal{P}_i$. Let $s_{\max}$ 
be as in Section 4. Then the whole algorithm requires space $O(s_{\max} h w)$. 



Proof: As seen, using LIFO the working stack can increase to size at most $h s_{\max}$. Since 
each multivalued row in the working stack requires space $O(w)$, and since the final stack 
remains empty (final rows $r$ are thrown away after recording $|r|$), the claim follows. ■ 
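
The LIFO bookkeeping can be sketched in Python for the special case of noncovers. This is a simplified illustration under stated assumptions, not the n-algorithm of Section 5: rows use only the labels 0, 1, 2 (no n wildcard), infeasible rows are simply deleted, and the function name and instance are hypothetical.

```python
def count_noncovers_lifo(w, sets):
    """Count the X in {1..w} containing none of `sets`, with a LIFO working stack."""
    h = len(sets)
    stack = [(["2"] * w, 0)]          # pairs (row, index of the pending constraint)
    total, max_stack = 0, 0
    while stack:
        max_stack = max(max_stack, len(stack))
        row, pc = stack.pop()         # only the top row is ever updated
        if pc == h:                   # finalized: record |r| and throw the row away
            total += 2 ** row.count("2")
            continue
        A = sorted(sets[pc])
        if any(row[a - 1] == "0" for a in A):
            stack.append((row, pc + 1))   # constraint already holds (trivial son)
            continue
        free = [a for a in A if row[a - 1] == "2"]
        # Disjoint sons: son i sets free[0..i-1] to 1 and free[i] to 0.
        # If `free` is empty there are no sons, i.e. the row is deleted.
        for i, a in enumerate(free):
            son = row[:]
            for b in free[:i]:
                son[b - 1] = "1"
            son[a - 1] = "0"
            stack.append((son, pc + 1))
    return total, max_stack

# Hypothetical instance: the edges of a path on 4 vertices.
total, max_stack = count_noncovers_lifo(4, [{1, 2}, {2, 3}, {3, 4}])
```

Only the running total survives a finalized row, so the memory in use is the working stack alone.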

The actual row manipulations performed by the principle of exclusion (e.g. the number 
of deletions) are the same whether or not LIFO is used; what differs merely is the space 
required. Of course the LIFO-stack management is also practical for generating Horn- 
models (Sections 5,6), but Theorem 6 has no twin in that context. 

8. Counting all Horn-models of fixed cardinality 

We shall transfer Theorem 3 and Theorem 4 to the counting-framework as Theorem 7 
and 8. Additionally two more theorems are stated. As to the repeatedly used expression 
"provided $N > 0$", see the beginning of the proof of Theorem 1. Among the four counting- 
theorems in this section the first is about $(\le k)$-element models, and the other three about 
$k$-element models. Besides Theorem 5 we shall need this twin of (15): 

(16) [W2, Thm.4] Let $r$ be a $\{0, 1, 2, n\}$-valued row. It costs time $O(kw^3)$ to compute 
the $k$ numbers $\mathrm{Card}(r, 1), \ldots, \mathrm{Card}(r, k)$. 
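
One way to obtain these numbers is to multiply truncated polynomials, one factor per component of the row. The Python sketch below is an illustration under assumptions (rows encoded as lists of strings, with labels like "n1" for n-bubbles) and makes no claim about matching the time bound stated in (16).

```python
from math import comb

def card_numbers(row, k):
    """Return [Card(row, 0), ..., Card(row, k)] for a {0,1,2,n}-valued row."""
    ones = sum(s == "1" for s in row)
    twos = sum(s == "2" for s in row)
    bubble_len = {}
    for s in row:
        if s.startswith("n"):
            bubble_len[s] = bubble_len.get(s, 0) + 1
    # Coefficient of x^j = number of ways to place j ones on the free positions.
    poly = [comb(twos, j) for j in range(k + 1)]                  # (1 + x)^twos
    for m in bubble_len.values():
        # An n-bubble of length m contributes (1 + x)^m - x^m.
        factor = [comb(m, j) if j < m else 0 for j in range(k + 1)]
        poly = [
            sum(poly[i] * factor[j - i] for i in range(j + 1))
            for j in range(k + 1)
        ]
    # The forced 1's shift every cardinality by `ones`.
    return [poly[j - ones] if j >= ones else 0 for j in range(k + 1)]

# Hypothetical row (1, 2, 2, n, n, n, 0) of length 7.
cards = card_numbers(["1", "2", "2", "n1", "n1", "n1", "0"], 7)
```

Summing the returned list up to index $k$ gives $\mathrm{Card}(r, \le k)$, and summing all of it for $k = w$ gives $|r|$.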

In Theorem 7, 8, 10 we let again $R \le N$ be the number of final rows produced by the 
Horn n-algorithm. 



Theorem 7: Given is a Horn $h$-formula $\Sigma \cup \Theta$ on $w$ variables, and a fixed integer 
$k \le w$. Then the $N$ many models $X$ with $|X| \le k$ can be counted (provided $N > 0$) in time 
$O(Rkh^2w^3) = O(Nkh^2w^3)$. 



Proof. It suffices to verify (a), (b) in Theorem 5 for $f(h,w) := khw^2$, because then $N$ can 
be calculated in time 

$$O(Rhwf(h,w)) = O(Rkh^2w^3).$$ 

In the proof of Theorem 3 we saw that the extra feasibility of a row $r$ can be checked in 
time $O(hw) = O(f(h,w))$, and so condition (a) holds. As to (b), by (16) the calculation 
of $\mathrm{Card}(r, \le k) = \mathrm{Card}(r, 1) + \cdots + \mathrm{Card}(r, k)$ costs $O(kw^3) = O(wf(h,w))$. ■ 
Theorem 8: 
Given are $h$ subsets $A_i^*$ of a $w$-set $W$ such that $h < w$. Fix a non-negative integer 
$k \le w - h$. Then the number $N$ of $k$-element noncovers of the set system $\{A_1^*, \ldots, A_h^*\}$ 
can (provided $N > 0$) be calculated in time $O(Rkh^2w^3) = O(Nw^6)$. 



Proof. If we manage to verify (a), (b) for $f(h,w) := khw^2$ in the $(= k)$-version of 
Theorem 5, the $O(Rhwf(h,w)) = O(Rkh^2w^3)$ claim will again follow. We have (a) 
because of $O(hw) = O(f(h,w))$. As to (b), by (16) calculating $\mathrm{Card}(r, k)$ costs $O(kw^3) = 
O(wf(h,w))$; unfortunately $\mathrm{Card}(r, k)$ isn't cheaper than $\mathrm{Card}(r, \le k)$ before. ■ 

A natural idea is to calculate the number of models $X$ with $|X| = k$ as $N = N' - N''$, 
where $N'$ and $N''$ are the easier numbers of models of cardinality $\le k$ and $\le k - 1$ 
respectively. Unfortunately, albeit unlikely in practice, $N'$ and $N''$ may grow exponentially 
with respect to $N$. Nevertheless, the idea can be saved in this form: 



Theorem 9: Given is a Horn $h$-formula on $w$ variables and an integer $k \le w$. 
Suppose it is known that the number of $k^*$-element models increases as $k^*$ increases 
from 1 to $k$. Then the $N$ models $X$ with $|X| = k$ can (provided $N > 0$) be counted in time 
$O(Nk^2h^2w^3) = O(Nh^2w^5)$. 



Proof. Let $N'$ and $N''$ be as above. Because of our assumption about increasing $k^*$-levels, 
we have $N' \le Nk$ and $N'' \le N(k-1)$. From $N = N' - N''$ and Theorem 7 hence follows 
that calculating $N$ costs $O(N'kh^2w^3) + O(N''kh^2w^3) = O((Nk)kh^2w^3)$. ■ 

Our last theorem confronts the naked $|X| = k$ condition. To prepare for it, consider this 
$\{0, 1, 2, n\}$-valued row $r$ of length thirteen: 





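[Table: the row $r$ (containing the n-bubbles $n_1$, $n_2$, $n_3$) and the derived row $r_0$ over the 
columns 1, ..., 13. Consistent with the discussion below, 
$r_0 = (0, n, n, 1, 1, 0, 1, 1, 0, 2, 2, 1, 1)$.] 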



Let's calculate the number $N_0$ of $X \in r$ that satisfy neither of the implications $\{4, 5\} \to \{9\}$ 
and $\{5, 7, 8\} \to \{1\}$, nor the negative clause $\{4, 8, 12\}^*$, and that have cardinality 
$|X| \neq 8$. The failure of the three formulae forces the boldface entries in row $r_0$; they in 
turn trigger all the further differences between $r_0$ and $r$. Since $|\mathrm{ones}(r_0)| = 6$ the number 
of $X \in r_0$ with $|X| = 8$ is the number of ways to place exactly two 1's upon $nn22$. Ad hoc 
this number evaluates to 5, and so $N_0 = |r_0| - 5 = 7$. Notice that the argument above 
necessitates unit implications, i.e. having singleton conclusions. 
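
The small numbers in this example can be checked by brute force. The Python sketch below only uses what the text states about the derived row: six forced 1's, one n-bubble $nn$, and two free entries $22$; the variable names are hypothetical.

```python
from itertools import product

ones = 6                  # number of forced 1's in the derived row
count_all = 0             # members of the derived row
count_size_8 = 0          # members of cardinality exactly 8
for n_a, n_b, t_a, t_b in product((0, 1), repeat=4):
    if n_a == 1 and n_b == 1:
        continue          # the wildcard nn forbids the pattern 11
    count_all += 1
    if ones + n_a + n_b + t_a + t_b == 8:
        count_size_8 += 1
N0 = count_all - count_size_8
```

The loop visits $3 \cdot 4 = 12$ admissible patterns, of which 5 have cardinality 8, leaving 7.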
Theorem 10: Given is a Horn $h$-formula $\Sigma \cup \Theta$ on $w$ variables such that $\Sigma$ consists 
of unit implications. For any integer $k \le w$ the $N$ many models $X$ with $|X| = k$ 
can (provided $N > 0$) be counted in time $O(R2^hhw^4k) = O(N2^hhw^5)$. 



Since each implication $A \to B$ can be split into $|B|$ unit implications, Theorem 10 really 
handles arbitrary sets of Horn formulae. The factor $2^h$ doesn't look so ugly if one recalls 
that naively checking all $k$-element sets costs $O(w^{k+2})$. For instance, letting $h = \alpha w$ 
($\alpha > 0$ fixed) and $k = w/\beta$ ($\beta > 1$ fixed) one has $2^h / w^{k+2} \to 0$ as $w \to \infty$. 

Proof of Theorem 10. We put $f(h,w) = k2^hw^3$ and verify (a), (b) in the $(= k)$-version of 
Theorem 5. The claim then follows from $O(Rhwf(h,w)) = O(R2^hhw^4k)$. As to (b), it is 
satisfied since by (16) calculating $\mathrm{Card}(r, k)$ costs $O(kw^3) = O(wf(h,w))$. 

As to (a), let $a_j$ be the property of any $Y \in r$ that it satisfies the $j$-th component formula 
in $\Sigma \cup \Theta$ ($1 \le j \le h$). Further let $a_{h+1}$ be the property that $|Y| = k$. If $N(a_1, \ldots, a_{h+1})$ 
is the number of $Y \in r$ satisfying all properties, then $r$ is extra feasible if and only if 
$N(a_1, \ldots, a_{h+1}) > 0$. The latter number by inclusion-exclusion is 

$$N(a_1, \ldots, a_{h+1}) = |r| - N(\bar{a}_1) - \cdots - N(\bar{a}_{h+1}) + N(\bar{a}_1, \bar{a}_2) + \cdots + N(\bar{a}_h, \bar{a}_{h+1}) - N(\bar{a}_1, \bar{a}_2, \bar{a}_3) - \cdots \pm N(\bar{a}_1, \bar{a}_2, \ldots, \bar{a}_{h+1})$$ 

where e.g. $N(\bar{a}_3, \bar{a}_{h+1})$ denotes the number of $Y \in r$ that violate properties $a_3$ and 
$a_{h+1}$. As seen in the example, each summand $N_0$ involving $\bar{a}_{h+1}$ can be calculated as 
$N_0 = |r_0| - \mathrm{Card}(r_0, k)$ for some immediately derived row $r_0$. If $\bar{a}_{h+1}$ is not occurring, the 
calculation boils down to $N_0 = |r_0|$. By (16) computing $\mathrm{Card}(r_0, k)$ costs $O(kw^3)$, and so 
calculating $N(a_1, \ldots, a_{h+1})$ costs $O(2^hkw^3) = O(f(h,w))$. ■ 
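
The inclusion-exclusion sum itself can be illustrated with a small Python sketch. It is only a check of the identity on a toy instance: each summand is obtained here by scanning a finite universe, whereas the proof obtains it from a derived row. The function name, the universe and the three properties are assumptions for illustration.

```python
from itertools import combinations

def count_all_satisfied(universe, properties):
    """Inclusion-exclusion: number of items satisfying every property."""
    total = 0
    m = len(properties)
    for size in range(m + 1):
        for S in combinations(range(m), size):
            # Number of items violating all properties indexed by S;
            # the empty S contributes the size of the whole universe.
            violating = sum(
                all(not properties[i](y) for i in S) for y in universe
            )
            total += (-1) ** size * violating
    return total

W = range(1, 6)
universe = [frozenset(c) for s in range(6) for c in combinations(W, s)]
properties = [
    lambda Y: not {1, 2} <= Y,         # negative clause {1,2}*
    lambda Y: 3 not in Y or 4 in Y,    # unit implication {3} -> {4}
    lambda Y: len(Y) == 2,             # cardinality condition |Y| = k with k = 2
]
n_models = count_all_satisfied(universe, properties)
```

With $h + 1$ properties the sum has $2^{h+1}$ terms, which is where the factor $2^h$ in the cost comes from.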

Using again $f(h,w) = k2^hw^3$ but Theorem 1 instead of Theorem 5 one immediately 
derives that $O(Nhwf(h,w)) = O(N2^hhw^5)$ is the cost when "counting" is substituted by 
"generating" in Theorem 10. 



9. Positioning the principle of exclusion 

As seen, the POE is concerned with the models of Boolean functions $b : \{0, 1\}^w \to \{0, 1\}$ 
when $b$ is given as a conjunction of $h$ suitable subformulae. Usually we employ set theoretic 
terminology and thus speak of $h$ constraints $\mathcal{P}_i \subseteq 2^W$. More about "suitable" in Section 
9.4. The forthcoming comparison of POE with other methods is preliminary and is cut 
along these basic tasks concerning NP-hard problems: 

• Count all models or all $k$-element models 

• Generate all models 
• Find one best (or all best) models with respect to a target function $f(x)$ 

9.1 Counting all models or all $k$-element models 

The POE struggles to compete with binary decision diagrams (BDD), now incorporated 
in Mathematica 8.0, when it comes to counting all models of the Boolean function $b : 
\{0, 1\}^w \to \{0, 1\}$. In brief, the BDD associated to $b(x)$ is a certain directed graph with, 
among other nodes, a root of indegree zero and two endnodes True and False of outdegree 
zero. Except for the endnodes all nodes have exactly two outgoing arcs labelled 0 and 1. 
Any bitstring $x \in \{0, 1\}^w$ triggers an obvious directed path that starts at the root and 
ends at either True or False, depending on whether $b(x) = 1$ or $b(x) = 0$. Starting at the 
bottom of the BDD, it is well known and easy to recursively compute the exact probability 
$p$ that a random bitstring $x \in \{0, 1\}^w$ triggers a path that ends at True. It follows that 
$|\mathrm{Mod}_h|$ can be calculated with lightning speed as $p\,2^w$. Faster still is it to decide whether 
or not some Boolean function is merely satisfiable. (Something good comes out of that 
for POE. Namely, a multivalued row $r$ often readily spawns a Boolean function $b_r(x)$ 
such that the feasibility of $r$ amounts to the satisfiability of $b_r(x)$. This needs further 
exploration.) 
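
The bottom-up probability computation can be sketched in a few lines of Python. The dictionary encoding of the BDD and the example function $b(x) = (x_1 \wedge x_2) \vee x_3$ are assumptions for illustration; no BDD library is used.

```python
from fractions import Fraction

# Hypothetical BDD for b(x) = (x1 AND x2) OR x3: node -> (0-child, 1-child).
bdd = {
    "n1": ("n3", "n2"),    # tests x1
    "n2": ("n3", True),    # tests x2
    "n3": (False, True),   # tests x3
}

def true_probability(node, bdd, memo=None):
    """Probability that a uniformly random bitstring reaches True from `node`."""
    memo = {} if memo is None else memo
    if node is True:
        return Fraction(1)
    if node is False:
        return Fraction(0)
    if node not in memo:
        low, high = bdd[node]
        # Each outgoing arc is taken with probability 1/2.
        memo[node] = (
            true_probability(low, bdd, memo) + true_probability(high, bdd, memo)
        ) / 2
    return memo[node]

w = 3
p = true_probability("n1", bdd)
model_count = p * 2 ** w
```

Memoizing per node keeps the work linear in the size of the BDD, and variables skipped along an arc need no special treatment because they do not change the probability.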

The BDD approach nevertheless loses out to POE regarding counting models of fixed 
cardinality $k$ (concerning the relevance of this task, see [BEHM]). The reason is that e.g. 
setting up in DNF a Boolean function /3 : {0, 1}^^ — > {0, 1} which is True exactly on the 
bitstrings of weight k = 12 already causes a memory complaint. Let alone building a 
BDD for the desired compound Boolean function $b(x) \wedge \beta(x)$; more details in [W2]. 

9.2 Generating all models 

All models $X \in \mathrm{Mod}_h$ can sometimes be generated one by one with "polynomial delay", 
e.g. by so called combinatorial Gray codes [S]. Closer to the compact encoding of $\mathrm{Mod}_h$ 
achieved by the POE are the BDD and the 0, 1 integer programming (01IP) frameworks. 
Indeed, both are fit to yield $\mathrm{Mod}_h$ in LIFO fashion (thus no space problem) as a disjoint 
union of $\{0, 1, 2\}$-valued rows. But the POE, due to its use of additional symbols (say 
$n$), is more flexible and hence tends to produce much fewer multivalued rows. Likewise, 
deletions of branches in the 01IP search tree are more frequent than in the POE search 
tree. Interestingly, in the case of a BDD the (albeit many) $\{0, 1, 2\}$-valued rows can be 
obtained without deletions [A, p. 22]. Different from Theorem 1 and 5 this doesn't yield 
theoretic results since the cost of constructing the BDD itself is usually impossible to 
assess; a notable exception is [Be]. 

Returning to the search trees of 01IP and POE, their main difference is not the frequency 
of cutting branches, nor the fact that one is binary and the other (usually) not, but the 
cause of branching in the first place. While 01IP-branching is due to setting variables $x_i$ 
to 0 or 1, the POE-branching is triggered by imposing new constraints upon multivalued 
rows. In this regard the POE bonds more with constraint programming (CP) than with 
BDD or 01IP. While the POE so far has all variables $x_i$ in $\{0, 1\}$, in constraint satisfaction 
problems [FA, ch.12] the $x_i$'s can assume values from larger but finite domains $D_i$ ($1 \le 
i \le w$). However, there is no reason to stick with $\{0, 1\}$ in future applications of POE. A 
more telling difference between CP and POE is this: Upon applying so called constraint 
propagation the domains shrink to $D_i' \subseteq D_i$ (if some $D_i' = \emptyset$, the problem is infeasible). 
Different from "POE constraint propagation" which concisely yields $\mathrm{Mod}_h$, CP constraint 
propagation only delivers $\mathrm{Mod}_h$ as an (using universal algebra talk) unknown subdirect 
product of $D_1' \times \cdots \times D_w'$. 

9.3 Finding one best or all best models 

If only one best model (with respect to $f(x)$) needs to be found, then all of $\mathrm{Mod}_h$ needs to be 
filtered anyway, and so the subdirect product complaint evaporates. Besides explicit 
enumeration and checking of models (say naively or with combinatorial Gray codes), 
the only known techniques to solve an NP-hard optimization problem exactly are branch 
and bound or dynamic programming. The POE in optimization mode falls under the 
branch and bound hat, along with 01IP and CP. What we called (weakly) feasible is an 
essential notion in any branch and bound algorithm. It is illustrative to contrast our 
weak feasibility with the CP concept of arc consistency [FA]. Both concepts refer to an 
individual constraint, but CP is about existence of variables in $D_i'$ while POE is about a 
whole multivalued row. 

When it comes to the approximate solution of NP-hard optimization problems, other 
techniques enter the stage: Simulated annealing, tabu search, genetic algorithms. Genetic 
algorithms bear a vague resemblance to POE in that a whole "population" of models is 
kept alive, but otherwise they differ a lot from POE. Of course also branch and bound 
can be switched to approximation mode: If $f(x)$ needs to be maximized, choose $t \in \mathbb{R}$ 
and apply branch and bound to either find some $x_0$ with $f(x_0) \ge t$, or to conclude that 
there are no such $x_0$. It has been pointed out that POE branch and bound easily yields 
all $x_0$ with $f(x_0) \ge t$, whereas 01IP branch and bound is hard pressed to do the same. 

9.4 Two final remarks 

First, the POE can also be applied to disjunctions as opposed to conjunctions of properties. 
Specifically, in order to find the cardinality $N$ (not the members themselves) of 
the set $\mathcal{P}_1 \cup \cdots \cup \mathcal{P}_h$, apply the standard POE to the negated constraints $\overline{\mathcal{P}_i}$ and get 
$N = 2^w - |\overline{\mathcal{P}_1} \cap \cdots \cap \overline{\mathcal{P}_h}|$. For instance, the cardinality of a simplicial complex can thus 
be determined from its facets. 
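
The complement trick can be illustrated on a tiny simplicial complex. In the Python sketch below a brute-force scan stands in for the POE, and the function name and the facets are assumptions for illustration; $\mathcal{P}_i$ is taken to be the family of all subsets of the $i$-th facet.

```python
from itertools import combinations

def complex_size_via_complement(w, facets):
    """Size of the union of the P_i, computed as 2^w minus the sets outside all P_i."""
    ground = range(1, w + 1)
    outside = 0
    for size in range(w + 1):
        for X in combinations(ground, size):
            # X satisfies every negated constraint iff it lies in no facet.
            if all(not set(X) <= F for F in facets):
                outside += 1
    return 2 ** w - outside

# Hypothetical complex on {1,2,3,4} with facets {1,2,3} and {3,4}.
n_faces = complex_size_via_complement(4, [{1, 2, 3}, {3, 4}])
```

Here 6 of the 16 subsets lie in no facet, so the complex has 10 faces, the empty face included.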

Second, what "suitable subformulae" meant at the beginning of Section 9 is merely that 
they allow a POE implementation that is efficient in practice, whether or not that can be 
backed theoretically. For instance in [W1] the suitable subformulae nicely match the stars 
of a graph (all edges incident with a vertex form a star) but a theoretic assessment of 
the resulting swift algorithm eluded the author. If the suitable subformulae are negative 
clauses or implications, then theory (present article) and practice (e.g. [W2]) go hand in 
hand. 

Recall that every Boolean function is equivalent to a conjunction of clauses, but different 
from Section 2 such a clause can have more than one positive literal; say 
$\bar{a}_1 \vee \bar{a}_2 \vee a_3 \vee a_4 \vee a_5$. 
Nevertheless, being equivalent to $(a_1 \wedge a_2) \to (a_3 \vee a_4 \vee a_5)$, we may view it as an 
$(\wedge, \vee)$-implication, and observe that its models are contained in the disjoint multivalued 
rows $(n, n, 2, 2, 2)$ and $(1, 1, e, e, e)$. (The dual e-symbolism was explained in Section 6.) 
Simultaneous occurrence of $n$, $e$ complicates a theoretic treatment, yet the resulting POE 
implementation performs well in practice (work in progress). 
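
The two rows for this clause are easily checked by brute force over the 32 bitstrings. The following Python sketch is only such a check; the helper names are hypothetical.

```python
from itertools import product

def in_row_n(bits):
    """Row (n, n, 2, 2, 2): a1 and a2 are not both 1."""
    return not (bits[0] == 1 and bits[1] == 1)

def in_row_e(bits):
    """Row (1, 1, e, e, e): a1 = a2 = 1 and at least one 1 among a3, a4, a5."""
    return bits[0] == 1 and bits[1] == 1 and any(bits[2:])

def is_model(bits):
    """Clause (not a1) or (not a2) or a3 or a4 or a5."""
    return bits[0] == 0 or bits[1] == 0 or any(bits[2:])

all_bits = list(product((0, 1), repeat=5))
models = [b for b in all_bits if is_model(b)]
covered = [b for b in all_bits if in_row_n(b) or in_row_e(b)]
overlap = [b for b in all_bits if in_row_n(b) and in_row_e(b)]
```

The first row contributes $3 \cdot 8 = 24$ models and the second $7$, which accounts for all $32 - 1 = 31$ models of the clause.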

Acknowledgement: I am grateful to Egon Balas for insightful comments, and to an 
anonymous referee for a quality of constructive criticism that I haven't experienced in a 
long time. 